Change-point model on nonhomogeneous Poisson processes with application in copy number profiling by next-generation DNA sequencing
We propose a flexible change-point model for inhomogeneous Poisson processes,
which arise naturally from next-generation DNA sequencing, and derive score and
generalized likelihood statistics for shifts in intensity functions. We
construct a modified Bayesian information criterion (mBIC) to guide model
selection, and point-wise approximate Bayesian confidence intervals for
assessing the confidence in the segmentation. The model is applied to DNA copy
number profiling with sequencing data and evaluated on simulated spike-in and
real data sets. Comment: Published at http://dx.doi.org/10.1214/11-AOAS517 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
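To make the idea of a generalized likelihood ratio statistic for a shift in intensity concrete, here is a minimal sketch under strong simplifying assumptions: the reads are binned into equal-width bins, the rate is constant within each segment, and only a single change point is considered. The function name `poisson_glr_single_change` is hypothetical, and this is not the statistic derived in the paper, which handles nonhomogeneous intensities.

```python
import math


def poisson_glr_single_change(counts):
    """Scan all split points of binned counts for one change in a constant
    Poisson rate. Returns (best_split, statistic), where best_split is the
    number of bins in the left segment and statistic is twice the log
    likelihood ratio of "two rates" versus "one rate"."""
    n = len(counts)
    total = sum(counts)

    def term(s, m):
        # Maximized Poisson log-likelihood of a segment with sum s over m bins,
        # up to terms that cancel in the ratio; defined as 0 when s == 0.
        return s * math.log(s / m) if s > 0 else 0.0

    null = term(total, n)
    best_split, best_stat = None, 0.0
    left = 0
    for k in range(1, n):
        left += counts[k - 1]
        stat = 2.0 * (term(left, k) + term(total - left, n - k) - null)
        if stat > best_stat:
            best_split, best_stat = k, stat
    return best_split, best_stat
```

For example, calling it on `[2, 3, 2, 3, 2, 10, 11, 9, 10, 12]` selects the split after the fifth bin, where the counts jump. A model-selection criterion such as the mBIC mentioned above would then be needed to decide how many change points to keep; that step is not shown here.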
Detecting simultaneous variant intervals in aligned sequences
Given a set of aligned sequences of independent noisy observations, we are
concerned with detecting intervals where the mean values of the observations
change simultaneously in a subset of the sequences. The intervals of changed
means are typically short relative to the length of the sequences, the subset
where the change occurs, the "carriers," can be relatively small, and the sizes
of the changes can vary from one sequence to another. This problem is motivated
by the scientific problem of detecting inherited copy number variants in
aligned DNA samples. We suggest a statistic based on the assumption that for
any given interval of changed means there is a given fraction of samples that
carry the change. We derive an analytic approximation for the false positive
error probability of a scan, which is shown by simulations to be reasonably
accurate. We show that the new method usually improves on methods that analyze
a single sample at a time and on our earlier multi-sample method, which is most
efficient when the carriers form a large fraction of the set of sequences. The
proposed procedure is also shown to be robust with respect to the assumed
fraction of carriers of the changes. Comment: Published at http://dx.doi.org/10.1214/10-AOAS400 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
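One way to build a statistic around an assumed carrier fraction is a mixture likelihood: each sample contributes through a two-component mixture in which it carries the change with probability `p0`. The sketch below assumes this form, along with unit-variance Gaussian noise and a zero baseline mean. The function name `mixture_scan_statistic` and the exact formula are assumptions made for illustration and are not taken from the abstract.

```python
import math


def mixture_scan_statistic(data, start, end, p0):
    """Mixture-type statistic for the interval [start, end).

    data: list of equal-length sequences, assumed to have unit-variance noise
          and zero mean outside any changed interval.
    p0:   assumed fraction of sequences carrying the change.
    """
    width = end - start
    total = 0.0
    for seq in data:
        # Standardized sum of this sequence over the candidate interval.
        u = sum(seq[start:end]) / math.sqrt(width)
        # Log of the mixture likelihood ratio: non-carrier with weight 1 - p0,
        # carrier with weight p0.
        total += math.log(1.0 - p0 + p0 * math.exp(0.5 * u * u))
    return total
```

A full scan would evaluate this over all candidate intervals and compare the maximum with a threshold chosen to control the false positive probability. With a small `p0`, samples with weak signals contribute almost nothing, which is the intuition for why such a statistic can help when carriers are rare.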
Detecting mutations in mixed sample sequencing data using empirical Bayes
We develop statistically based methods to detect single nucleotide DNA
mutations in next generation sequencing data. Sequencing generates counts of
the number of times each base was observed at hundreds of thousands to billions
of genome positions in each sample. Using these counts to detect mutations is
challenging because mutations may have very low prevalence and sequencing error
rates vary dramatically by genome position. The discreteness of sequencing data
also creates a difficult multiple testing problem: current false discovery rate
methods are designed for continuous data, and work poorly, if at all, on
discrete data. We show that a simple randomization technique lets us use
continuous false discovery rate methods on discrete data. Our approach is a
useful way to estimate false discovery rates for any collection of discrete
test statistics, and is hence not limited to sequencing data. We then use an
empirical Bayes model to capture different sources of variation in sequencing
error rates. The resulting method outperforms existing detection approaches on
example data sets. Comment: Published at http://dx.doi.org/10.1214/12-AOAS538 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
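A standard way to randomize a discrete test so that its p-value is exactly uniform under the null is to add a uniformly distributed fraction of the probability of the observed value to the strict tail probability. The sketch below applies this to an upper-tailed binomial test. It illustrates the general technique only and is not necessarily the authors' construction; the function names and the binomial null are assumptions.

```python
import math
import random


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def randomized_upper_pvalue(k, n, p, rng):
    """Randomized p-value P(T > k) + U * P(T = k), with U ~ Uniform(0, 1).

    Under the null T ~ Binomial(n, p), this value is uniform on (0, 1), so
    false discovery rate procedures designed for continuous p-values apply.
    """
    tail_strict = sum(binom_pmf(j, n, p) for j in range(k + 1, n + 1))
    return tail_strict + rng.random() * binom_pmf(k, n, p)
```

In use, `k` could be the number of reads showing a non-reference base at a position, `n` the depth, and `p` an assumed error rate, with `rng = random.Random(seed)` for reproducibility. The randomized value always lies between the strict and non-strict tail probabilities.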
Hoxb2 and Hoxb4 Act Together to Specify Ventral Body Wall Formation
Three different alleles of the Hoxb4 locus were generated by gene targeting in mice. Two alleles contain insertions of a selectable marker in the first exon in either orientation, and, in the third, the selectable marker was removed, resulting in premature termination of the protein. Presence and orientation of the selectable marker correlated with the severity of the phenotype, indicating that the selectable marker induces cis effects on neighboring genes that influence the phenotype. Homozygous mutants of all alleles had cervical skeletal defects similar to those previously reported for Hoxb4 mutant mice. In the most severe allele, Hoxb4PolII, homozygous mutants died either in utero at approximately E15.5 or immediately after birth, with a severe defect in ventral body wall formation. Analysis of embryos showed thinning of the primary ventral body wall in mutants relative to control animals at E11.5, before secondary body wall formation. Prior to this defect, both Alx3 and Alx4 were specifically downregulated in the most ventral part of the primary body wall in Hoxb4PolII mutants. Hoxb4loxp mutants, in which the neo gene has been removed, did not have body wall or sternum defects. In contrast, both the Hoxb4PolII and the previously described Hoxb2PolII alleles that have body wall defects have been shown to disrupt the expression of both Hoxb2 and Hoxb4 in cell types that contribute to body wall formation. Our results are consistent with a model in which defects in ventral body wall formation require the simultaneous loss of at least Hoxb2 and Hoxb4, and may involve Alx3 and Alx4.
- …